defined by a portion of retina ESTs that is greater than 30% of the total. One of the 1241
entries meeting these criteria, Hs. 60673, contained EST sequences from the 5'- and 3'- 
ends of two nearly identical cDNA clones isolated from the Soares retina N2b4HR cDNA 
library (ze39a04, ze32b03) (http://www.ncbi.nlm.nih.gov/Genbank/GenbankOverview.html).
Reverse transcription (RT)-PCR using oligonucleotides (SEQ ID NOS.: 46-47) A128F
(5'-CTC ACA TCC TTC TCA GCC-3') and A128R (5'-GTG GAA TGT CAG GGA AAT C-3'),
priming to sequences in the 5' reads of the cDNA clones, amplified a 193 bp product
from retinal RNA but not from various other adult human tissues tested.
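
Primer pairs such as A128F/A128R can be sanity-checked with a few lines of code. The following is a minimal sketch, not the procedure used in the source: it computes primer length, GC content and a rough melting temperature with the Wallace rule, Tm = 2(A+T) + 4(G+C). The choice of the Wallace rule and all function names are assumptions introduced here for illustration.

```python
def clean(primer: str) -> str:
    """Remove the spaces between triplets and upper-case the sequence."""
    return primer.replace(" ", "").upper()


def gc_count(seq: str) -> int:
    """Number of G and C bases in the sequence."""
    return seq.count("G") + seq.count("C")


def wallace_tm(seq: str) -> int:
    """Rough Tm in degrees C by the Wallace rule (assumes only A, C, G, T)."""
    gc = gc_count(seq)
    at = len(seq) - gc
    return 2 * at + 4 * gc


# Primer sequences as printed in the text (5' to 3').
A128F = clean("CTC ACA TCC TTC TCA GCC")
A128R = clean("GTG GAA TGT CAG GGA AAT C")

for name, seq in (("A128F", A128F), ("A128R", A128R)):
    print(name, len(seq), "nt,", gc_count(seq), "G/C, Tm ~", wallace_tm(seq), "C")
```

By this rough estimate both primers come out at the same melting temperature, which is what one would hope for in a matched pair.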

Inspection of the sequence of genomic clone NH0309N08 (GenBank Acc. No.
AC007279) harbouring EST sequences from Hs. 60673 revealed significant alignments
with further ESTs derived from retina cDNA clones (ze27h05, ze30f10, zf58a06,
ys72e09). On the basis of this additional cDNA sequence information, oligonucleotide
primers (SEQ ID NOS.: 48-51) A128F3 (5'-TGA CTG CCT CCA GGA ATT-3'),
A128aF (5'-TTA CGA AAT GAA TGG GCG-3'), A128aR (5'-AGG CTC TAG GTC
CAT GAC-3') and A128R3 (5'-ATG TGA AAT CTG CGA AAG G-3') were designed
and used to amplify retinal RNA in RT-PCR assays. The RT-PCR fragments were
completely sequenced with walking primer technology on an ABI 310 automated sequencer
(Perkin Elmer, Norwalk, USA) using the ABI PRISM Ready Reaction Sequencing Kit
(Perkin Elmer, Norwalk, USA). Assembly of the overlapping 1375 bp A128F3/A128aR-
and 786 bp A128aF/A128R3-amplified cDNA fragments, together with 414 bp of 5' end
sequence and 42 bp of 3' end sequence of cDNA clone ze27h05, yielded a 2435 bp
transcript with a conserved polyadenylation signal at nucleotide position 2416. It
should be noted that this full length transcript does not include the 5' end EST sequences
of cDNA clones ze39a04 and ze32b03 (Hs. 60673), which most likely have been derived
from incompletely spliced mRNA precursor molecules.

The full length 2435 bp cDNA contains an open reading frame (ORF) of 1980 bp with
a first potential in frame translation initiation codon, ATG, starting 69 nucleotides
downstream (see Seq. ID No. 1). Therefore, the protein predicted from the ORF consists
of 637 amino acid residues, resulting in a calculated molecular mass of 72.8 kDa and an
isoelectric point of 5.4.
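
The step from the ORF figures to the protein length can be checked with simple arithmetic. The sketch below shows one reading that reproduces the stated 637 residues. It assumes that the 69 nucleotides preceding the ATG are counted within the 1980 bp figure and that the stop codon is not; the source does not spell out either convention, and the function name is hypothetical.

```python
def codons_from_start(orf_length_bp: int, atg_offset_nt: int) -> int:
    """Codons between the first in-frame ATG and the end of the ORF.

    Assumption: atg_offset_nt nucleotides of the ORF lie upstream of the ATG.
    """
    coding_nt = orf_length_bp - atg_offset_nt
    if coding_nt % 3 != 0:
        raise ValueError("coding length is not a multiple of three")
    return coding_nt // 3


residues = codons_from_start(orf_length_bp=1980, atg_offset_nt=69)
print(residues)  # 637
```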

(B) Expression analysis 

RT-PCR analysis using oligonucleotide primers (SEQ ID NO.: 52) A128F4 (5'-CGT
GCC ATG ACT GAG TAC-3') and A128aR (sequence described above) identified an
844 bp product in human retina. No PCR amplification was observed in cerebellum,
brain stem, liver, lung, heart, thymus, placenta, uterus, prostate, retinal pigment
epithelium (rpe) and kidney. Northern blot analysis was performed with total RNA
isolated using the guanidinium thiocyanate method (Chomczynski and Sacchi,
Anal. Biochem. 162 (1987), 156-159). Total RNA (10 µg per lane) from temporal
cortex, muscle, retina and liver was electrophoretically separated in the
presence of formaldehyde. A 327 bp DNA fragment from the 3' untranslated region
(UTR) was obtained by PCR amplification of genomic DNA with primer pair (SEQ ID
NO.: 53) A128F6 (5'-AAC TGC AGT GGG TAC CAG-3')/A126R6 (sequence described
above) and was used as a probe for filter hybridization in 0.5 mM sodium phosphate
buffer, pH 7.2; 7% SDS, 1 mM EDTA at 58°C (Church and Gilbert, PNAS USA 81
(1984), 1991-1995). A single 3.8 kb transcript was identified exclusively in retina. The
results of our expression analysis provide evidence that MPP4 is specific to the human
retina (Figure 1).

(C) Genomic organization and chromosomal location of MPP4

[...]

The publicly accessible UniGene dataset, release no. 113, was searched for human EST
clusters consisting of ESTs exclusively derived from retina cDNA libraries or for EST
clusters with an enrichment of retina ESTs, defined by a portion of retina ESTs that is
greater than 30% of the total. One of the 1241 entries meeting these criteria, Hs. 60473,
contained approximately 350 bp of high quality EST sequences from the 3'-ends of two
cDNA clones (ze34f06, ze37g05) isolated from the Soares retina N2b4HR cDNA library.
The approximately 280 bp high quality EST sequences of the 5'-end of the cDNA clones
available at the dbEST database (http://www2.ncbi.nlm.nih.gov/dbEST/dbest_query.html)
do not overlap with the corresponding 3'-end ESTs.
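
The selection criterion described above (clusters whose retina ESTs make up more than 30% of all ESTs, which also covers retina-only clusters) can be sketched as a simple filter. This is an illustration only: the cluster identifiers and counts below are invented, and the source does not describe how the UniGene search was actually implemented.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    cluster_id: str   # hypothetical identifier, not a real UniGene entry
    retina_ests: int  # ESTs derived from retina cDNA libraries
    total_ests: int   # all ESTs in the cluster


def is_retina_enriched(cluster: Cluster, threshold_percent: int = 30) -> bool:
    """True if retina ESTs are strictly more than threshold_percent of the total."""
    if cluster.total_ests == 0:
        return False
    # Integer arithmetic avoids floating-point edge cases at exactly 30%.
    return cluster.retina_ests * 100 > threshold_percent * cluster.total_ests


clusters = [
    Cluster("example_A", retina_ests=4, total_ests=4),   # retina only
    Cluster("example_B", retina_ests=3, total_ests=10),  # exactly 30%, not selected
    Cluster("example_C", retina_ests=4, total_ests=10),  # 40%, selected
]
selected = [c.cluster_id for c in clusters if is_retina_enriched(c)]
print(selected)  # ['example_A', 'example_C']
```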

To isolate further cDNA clones representing this gene, a retina lambda-TriplEx2 cDNA 
library was screened with a radio-labeled 199 bp DNA fragment obtained by PCR 
amplification of genomic DNA with primers (SEQ ID NOS.: 54-55) A129F (5'-TCT 
GAG CCT AGA GGA TAC C-3') and A129R (5'-GAT CTC AGA GGC AGG TTG-3'). 
Fourteen positive clones with inserts ranging from 0.5 to 1.6 kb were isolated and
sequenced with walking primer technology on an ABI 310 automated sequencer (Perkin
Elmer, Norwalk, USA) using the ABI PRISM Ready Reaction Sequencing Kit (Perkin
Elmer, Norwalk, USA).



To isolate the complete 5'-end of the cDNA the technique of 5'-RACE (rapid
amplification of cDNA ends) was used (Frohman et al., PNAS USA 85 (1988),
8998-9002). First strand cDNA synthesis was primed using the gene-specific antisense
oligonucleotide A129R. Following cDNA synthesis, the first strand product was purified
from unincorporated dNTPs and remaining A129R primer. A homopolymeric tail was
then added to the 3' end of the cDNA using terminal deoxynucleotidyl transferase (TdT)
and dCTP. PCR amplification was accomplished using Taq DNA polymerase, the nested
gene-specific primer (SEQ ID NO.: 56) A129R5 (5'-TGC TGT GAA GAT TGG AGA
TC-3') that anneals to a site located within the cDNA molecule, and a deoxyinosine-
containing abridged anchor primer (SEQ ID NO.: 57), AAP (5'-GGC CAC GCG TCG
ACT AGT ACG GGI IGG GII GGG IIG-3'), provided by Life Technologies, Rockville,
USA. To increase the quantity of the specific cDNA product, the original PCR product
was re-amplified using the abridged universal amplification primer (SEQ ID NO.: 58),
AUAP (5'-GGC CAC GCG TCG ACT AGT AC-3'), provided by GIBCO Life Technologies,
and a second nested gene-specific primer (SEQ ID NO.: 59) A129R4 (5'-AGC TTG AAG
TGG CTA AAG TC-3'). Sequencing of the obtained PCR product using primer A129R4
did not reveal further upstream sequence, suggesting that the identified cDNA sequence
encompasses the complete 5' sequence starting from the transcription start site of the
transcript.
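
The relationship between the two anchor primers in the 5'-RACE procedure can be illustrated with a short check. The sketch assumes that "I" in the AAP sequence denotes deoxyinosine and that the G/I-rich 3' portion of AAP is the part that pairs with the dC tail added by TdT; the variable names are introduced here for illustration only.

```python
# Anchor primer sequences as given in the text (5' to 3'), spaces removed.
AAP = "GGC CAC GCG TCG ACT AGT ACG GGI IGG GII GGG IIG".replace(" ", "")
AUAP = "GGC CAC GCG TCG ACT AGT AC".replace(" ", "")

# AUAP should match the 5' adapter portion of AAP, so that products of the
# first PCR (primed with AAP) can be re-amplified with AUAP.
shares_adapter = AAP.startswith(AUAP)

# The remaining 3' portion of AAP is expected to consist only of G and I,
# the bases able to pair with the homopolymeric dC tail.
tail_binding = AAP[len(AUAP):]
only_g_or_i = set(tail_binding) <= {"G", "I"}

print(shares_adapter, tail_binding, only_g_or_i)
```

Under these assumptions the check confirms that the re-amplification primer is a 5' prefix of the anchor primer, which is why a nested gene-specific primer is needed only on the other side.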

Assembly of the cDNA sequences yielded a 1190 bp cDNA sequence which contains an
open reading frame (ORF) of 638 bp with a first potential in frame translation initiation
codon, ATG, starting 47 nucleotides downstream (Seq. ID No. 26-28). The encoded
putative protein consists of 196 amino acid residues and has a calculated molecular mass
of 22.3 kDa and an isoelectric point of 9.26.



Comparison of 14 different cDNA sequences revealed the presence of a single nucleotide
polymorphism (C/G) at position 143 bp causing the amino acid substitution isoleucine to
methionine at codon 32 of the putative protein sequence.
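
The mapping from nucleotide position 143 to codon 32 can be reproduced with a short calculation. The sketch below assumes that 47 nucleotides precede the ATG, so that the A of the ATG sits at position 48 (1-based). This reading is an assumption, but it is the one under which position 143 falls on the third base of codon 32 and the ORF (196 codons plus a stop codon) ends at nucleotide 638. Of the isoleucine codons, ATC is the one that a C to G change at the third base turns into ATG (methionine).

```python
ATG_POSITION = 48  # assumption: 47 nucleotides lie upstream of the ATG


def codon_of(position: int, atg_position: int = ATG_POSITION) -> tuple[int, int]:
    """Return (codon number, base within codon), both 1-based."""
    offset = position - atg_position
    if offset < 0:
        raise ValueError("position lies upstream of the start codon")
    return offset // 3 + 1, offset % 3 + 1


# Only the two codons needed for this example.
CODON_TABLE = {"ATC": "Ile", "ATG": "Met"}

codon_number, base_in_codon = codon_of(143)
print(codon_number, base_in_codon)             # 32 3
print(CODON_TABLE["ATC"], "->", CODON_TABLE["ATG"])

# Consistency check: last nucleotide of 196 codons plus a stop codon.
orf_end = ATG_POSITION + (196 + 1) * 3 - 1
print(orf_end)                                 # 638
```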

(B) Expression analysis 



Reverse transcription-PCR analysis using oligonucleotide primer pairs A129F/A129R and
A129F3 (SEQ ID NO.: 60) (5'-TGA TCT CCA ATC TTC ACA GC-3')/A129R
identified specific 199 bp and 244 bp cDNA fragments, respectively, in human retina
only (Figure 2). No PCR amplification was observed in human cerebellum, liver, lung,
heart, placenta, thymus and kidney. Northern blot analysis was performed as described
in Example 1. A 244 bp cDNA fragment from the 5' region was used as a probe for filter
hybridization in 0.5 mM sodium phosphate buffer, pH 7.2; 7% SDS, 1 mM EDTA at
58°C. Two transcripts of about 0.85 and 1.20 kb were identified exclusively in retina
(Figure 2).

(C) Genomic organization and chromosomal location of C7orf9 



To determine the exon/intron structure of C7orf9, the 1190 bp cDNA sequence was
aligned to the complete sequence of genomic BAC clone CTB-136N17 (GenBank Acc.
No. AC004129) using the BLASTN program at NCBI. A total of 3 exons were identified
with the putative translation start codon ATG located in exon 1 and the termination codon
TAA in exon 3 (Seq. ID No. 26-28).



This genomic sequence of BAC clone CTB-136N17 contains DNA marker stSG51683,
which has been mapped to the D7S2493-D7S529 interval on chromosome 7p15-p21 by
screening the Genebridge4 radiation hybrid panel
(http://www.ncbi.nlm.nih.gov/genome/seq).



(D) Nucleotide and protein database analyses 

The cDNA sequence of C7orf9 was subjected to homology searches using the BLASTN
program at Baylor College of Medicine (BCM) and revealed 100% sequence identity
between the coding region of C7orf9 and the human mRNA for RFamide-related peptide
precursor (GenBank accession number AB040290). Therefore, the putative translation
product of C7orf9 is identical to the RFamide-related peptide precursor (GenBank
accession number BAB17674). The analysis for specific motifs using the integration tool
for the signature-recognition methods in InterPro at the European Bioinformatics
Institute revealed that amino acids 99

[...]

screening and excised as plasmids from the phage vector following the instructions of the
SMART library kit manual (Clontech, Palo Alto, USA). In the case of the lambda-gt10
cDNA library, one clone was isolated by PCR amplification. Primers A071F (described
above) and lambda-gt10F (SEQ ID NO.: 63) (5'-AGC AAG TTC AGC CTG GTT AAG-3')
were used to amplify the clone from a mixed phage lysate containing the positive
clone. Additionally, 750 bp of F379 cDNA was amplified from retina cDNA using
primer pair A071F (described above) and A071R2 (SEQ ID NO.: 64) (5'-ATG TTC
AGT CAG GCA GGG-3'). All cDNA library clones and PCR products were sequenced
using the ABI PRISM Ready Reaction Sequencing Kit on an ABI 310 automated
sequencer (Perkin Elmer, Norwalk, USA).



The 1188 bp full length consensus cDNA sequence of F379 (Seq. ID No. 7) was
determined from a compilation of the DNA sequences from the cDNA library clones, the
PCR products and the ESTs of Hs. 35493. An alignment of these sequences to the
consensus cDNA sequence of F379 revealed that there were single base pair variations.
These single base pair changes are summarized in Table 1. The full length consensus
cDNA contained a putative open reading frame (ORF) of 85 amino acids (Seq. ID No.
31), starting at 347 bases from the most 5' end of the full length consensus cDNA. The
single base changes in the cDNA do not truncate the putative ORF by introducing a stop
codon; rather, the variations cause amino acid substitutions or have no effect on the
putative ORF (Table 1). The ORF contains Alu and MIR repetitive elements, which
together account for 68 amino acids. The predicted protein has a calculated molecular
mass of 9.2 kDa and an isoelectric point of 6.81.

Table 1: Single base variations in the cDNA sequence and their associated amino acid
changes

Position from         Nucleotide    Amino Acid
beginning of cDNA     Change        Change

325                   G             n/a*
429                   T             L
442                   A             R
528                   T             I
557                   T             S
932                   A             n/a*
971                   C             n/a*
987                   T             n/a*

* single base pair variation is located outside of putative ORF
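
The "n/a" entries in Table 1 can be cross-checked against the ORF coordinates given in the text. The sketch below assumes that the first base of the ATG is at position 347 and that the ORF spans 85 codons (stop codon not counted); both are readings of the text rather than stated facts, and the variable names are hypothetical.

```python
ORF_START = 347                             # assumption: first base of the ATG
ORF_CODONS = 85                             # ORF length in amino acids, from the text
ORF_END = ORF_START + ORF_CODONS * 3 - 1    # last coding base, stop codon excluded

# Position -> amino acid column of Table 1 ("n/a" = outside the putative ORF).
TABLE_1 = {
    325: "n/a", 429: "L", 442: "R", 528: "I",
    557: "S", 932: "n/a", 971: "n/a", 987: "n/a",
}


def in_orf(position: int) -> bool:
    """True if the position lies within the assumed ORF coordinates."""
    return ORF_START <= position <= ORF_END


for position, amino_acid in TABLE_1.items():
    consistent = in_orf(position) == (amino_acid != "n/a")
    print(position, "inside ORF" if in_orf(position) else "outside ORF", consistent)
```

Under these assumptions every position with an amino acid entry falls inside the ORF and every "n/a" position falls outside it, in agreement with the table footnote.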



(B) Expression analysis 

Reverse transcription-polymerase chain reaction (RT-PCR) using oligonucleotides A071F 
and A071R, priming to sequences in the 5' reads of the cDNA clones, amplified a 328 
bp transcript from human retina RNA but not from uterus, cerebellum, heart, liver or 
lung RNA. Furthermore, Northern blot analysis was performed as described in Example 
1. A 219 bp DNA fragment from the 3' region of the gene was obtained by PCR 
amplification of genomic DNA with primer pair A071F3 (SEQ ID NO.: 65) (5'-TTC
TTG TCG GAT GCC CTC-3') and A071R2 (described above). This DNA fragment was
used as a probe for filter hybridization in 0.5 mM sodium phosphate buffer, pH 7.2; 7%
SDS, 1 mM EDTA at 58°C. A single transcript of about 1.1 kb was identified only in
retina. The results of the expression analysis show that F379 is found exclusively in
retina (Figure 3). Furthermore, the size of the transcript detected by Northern blot
correlates to the size of the full length cDNA consensus sequence (1188 bp).

(C) Genomic organization and chromosomal location of F379

To determine the exon/intron structure of F379, the 1188 bp consensus cDNA sequence 
was aligned to the finished and unfinished genomic sequences using the BLASTN 
program at NCBI. The complete cDNA sequence of F379 aligned to genomic clones from
different chromosomes, including chromosome 19 (LLNLR-222A1), chromosome 22
(RP11-395L14), chromosome 2 (RP11-559H14), chromosome 21 (RP11-34P13),
chromosome 10 (RP11-438F6), chromosome 12 (RP11-598F7), and chromosome 9
(RP11-142M1). Partial alignments were also found to genomic clones from chromosome
15 (15qtel_cl84at3), chromosome 12 (12PTEL057, 12PTEL055, RPCI11-55L14) and
chromosome 19 (CTD-2102P23). These alignments identified three exons ranging from
205 bp to 621 bp. The putative translation start codon ATG is located in exon 1 and the
termination codon TGA is located in exon 3. 

PCR-based screening of two different human/rodent somatic cell hybrid DNA mapping
panels also indicated the multicopy nature of F379. A commercial human/rodent somatic
cell hybrid mapping panel (Mapping Panel 2 from Coriell Institute for Medical Research,
Camden, USA) was screened with primer set A071F (described above) and A071R
(described above), yielding a 328 bp product in cell line DNA containing chromosomes
2, 3, 6, 9, 12, 15, 19, and 20. Based on this result, gene names D2F379S1E,
D3F379S2E, D6F379S3E, D9F379S4E, D12F379S5E, D15F379S6E, D19F379S7E, and
D20F379S8E were assigned to chromosomes 2, 3, 6, 9, 12, 15, 19, and 20, respectively,
by the Genome Database (http://www.gdb.org/). The multi-chromosomal location of

[...]

Example 4: C12orf7

(A) Isolation of C12orf7 cDNA 

The publicly accessible UniGene dataset, release no. 113, was searched for human EST 
clusters consisting of ESTs exclusively derived from retina cDNA libraries or for EST 
clusters with an enrichment of retina ESTs, defined by a portion of retina ESTs that is 
greater than 30% of the total. One of the 1241 entries meeting these criteria, Hs.28411, 
contained 10 EST sequences. Eight ESTs represent the 5'- and 3'-ends of four cDNA 
clones isolated from the Soares retina N2b4HR cDNA library (zf50g06, ze44g08, 
yt72c07, zf52h05) and two represent the 3'-ends of two cDNA clones isolated from the
Soares placenta Nb2HP cDNA library (yi08f03.s1, yi75a07.s1).

To identify the full length cDNA transcript of C12orf7, a lambda-gt10 retina cDNA
library was probed with an alpha-32P-dCTP-labeled 863 bp fragment obtained by PCR
amplification of cDNA clone zf50g06 using primer pair (SEQ ID NOS.: 66-67) A038F3
(5'-CGG AAC CGC TGT GAG TGC-3') and A038F (5'-TAG GCA GAG GTG GAT
GGG-3'). The inserts of eleven positive clones were sequenced with walking primer
technology using the ABI PRISM Ready Reaction Sequencing Kit on an ABI 310
automated sequencer (Perkin Elmer, Norwalk, USA).

Compilation of the 11 cDNA sequences revealed two different cDNA species. One cDNA
molecule consists of 1428 bp; the second cDNA sequence contains an insertion of 30 bp
at nucleotide position 549. To isolate the complete 5'-end of the cDNA the technique of
5'-RACE (rapid amplification of cDNA ends) was used as described in Example 2 except
that first strand cDNA synthesis was primed with the gene-specific antisense
oligonucleotide A038F and PCR amplification was accomplished using the gene-specific
primer (SEQ ID NO.: 68) A038R3 (5'-GGC CAC TCG GGC TTG TAG-3') and a second
nested gene-specific primer (SEQ ID NO.: 69) A038R4 (5'-GTG CAA TGC CAG CTC
TTC-3'). Sequencing of the obtained PCR product using primer A038R4 revealed an
additional 86 bp of 5' sequence. Assembly of the 5'-RACE sequence and the cDNA
sequences obtained from the cDNA clones yielded a 1514 bp (Seq. ID No. 35) and a
1544 bp transcript (Seq. ID No. 36).
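
The two transcript lengths follow directly from the figures given above, as the short calculation below shows. The variable names are introduced here for illustration; the numbers are taken from the text.

```python
CLONE_CDNA_BP = 1428      # shorter cDNA species compiled from the library clones
INSERTION_BP = 30         # insertion present in the second cDNA species
RACE_EXTENSION_BP = 86    # additional 5' sequence obtained by 5'-RACE

short_transcript = CLONE_CDNA_BP + RACE_EXTENSION_BP
long_transcript = CLONE_CDNA_BP + INSERTION_BP + RACE_EXTENSION_BP

# A 30 bp insertion is a whole number of codons, so it would add
# 10 residues if it lies in frame within the coding region.
extra_codons = INSERTION_BP // 3

print(short_transcript, long_transcript, extra_codons)  # 1514 1544 10
```

The 10 additional codons are in line with the 345 and 355 amino acid proteins reported for the two variants.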

Comparison of the cDNA sequences revealed the presence of two single nucleotide 
polymorphisms at position 40 bp (A/T) and 88 bp (C/T) of Seq. ID No. 35 and 36. 

Both cDNA variants contain the same putative open reading frame (ORF) encoding a 345 
amino acid (aa) (Seq. ID No. 37) and a 355 aa (Seq. ID No. 38) protein. The putative 
proteins share the same potential in frame initiation codon, ATG, located 154 nucleotides 
downstream of the most 5' cDNA sequence. The putative protein sequences No. 11a and
No. 11b have a calculated molecular mass of 37.1 kD and 38.0 kD and an isoelectric
point of 5.59 and 5.49, respectively.

(B) Expression analysis 

Reverse transcription-PCR using oligonucleotides A038F and A038R (SEQ ID NO.: 70)
(5'-TGC CAA GCT GTT AGT GCC-3'), priming to the 3' end of the cDNA sequence,
amplified a 231 bp cDNA fragment from human retina RNA but not from human brain,
heart, liver, lung or uterus RNA. RT-PCR using primers A038F4 (SEQ ID NO.: 71)
(5'-CAT GCT ACC ACG GCT TCC-3') and A038R3 amplified a 379 bp and a 409 bp
fragment from human retina RNA but not from human cerebellum, heart, kidney, liver,
lung, placenta or thymus RNA (example in Figure 4).



